Code you used to create, initialize, and push a portfolio repo to GitHub:
git init
git add .
git commit -m "First commit"
git remote add origin https://remote_repository_URL
git remote -v
git push -u origin master
To push my portfolio repo to GitHub for updates:
git add .
git commit -m "First commit"
git push
---
title: "Pretty_html"
author: "Alison Fong 33399149"
date: "version February 16, 2018"
output:
  html_document:
    toc: yes
---
# R Markdown PDF Challenge
The following assignment is an exercise for the reproduction of this .html document using the RStudio and RMarkdown tools we’ve shown you in class. Hopefully by the end of this, you won’t feel at all the way this poor PhD student does. We’re here to help, and when it comes to R, the internet is a really valuable resource. This open-source program has all kinds of tutorials online.
http://phdcomics.com/ Comic posted 1-17-2018
The goal of this R Markdown html challenge is to give you an opportunity to play with a bunch of different RMarkdown formatting. Consider it a chance to flex your RMarkdown muscles. Your goal is to write your own RMarkdown that rebuilds this html document as close to the original as possible. So, yes, this means you get to copy my irreverent tone exactly in your own Markdowns. It’s a little window into my psyche. Enjoy =)
hint: go to the PhD Comics website to see if you can find the image above
If you can’t find that exact image, just find a comparable image from the PhD Comics website and include it in your markdown
Let’s be honest, this header is a little arbitrary. But show me that you can reproduce headers with different levels please. This is a level 3 header, for your reference (you can most easily tell this from the table of contents).
Perhaps you’re already really confused by the whole markdown thing. Maybe you’re so confused that you’ve forgotten how to add. Never fear! A calculator R is here:
1231521+12341556280987
## [1] 1.234156e+13
Or maybe, after you’ve added those numbers, you feel like it’s about time for a table!
I’m going to leave all the guts of the coding here so you can see how libraries (R packages) are loaded into R (more on that later). It’s not terribly pretty, but it hints at how R works and how you will use it in the future. The summary function used below is a nice data exploration function that you may use in the future.
library(knitr)
kable(summary(cars),caption="I made this table with kable in the knitr package library")
| speed | dist |
|---|---|
| Min. : 4.0 | Min. : 2.00 |
| 1st Qu.:12.0 | 1st Qu.: 26.00 |
| Median :15.0 | Median : 36.00 |
| Mean :15.4 | Mean : 42.98 |
| 3rd Qu.:19.0 | 3rd Qu.: 56.00 |
| Max. :25.0 | Max. :120.00 |
And now you’ve almost finished your first RMarkdown! Feeling excited? We are! In fact, we’re so excited that maybe we need a big finale eh? Here’s ours! Include a fun gif of your choice!
YAAAAY
library(tidyverse)
## ── Attaching packages ──────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1 ✔ purrr 0.2.4
## ✔ tibble 1.4.2 ✔ dplyr 0.7.4
## ✔ tidyr 0.8.0 ✔ stringr 1.2.0
## ✔ readr 1.1.1 ✔ forcats 0.2.0
## ── Conflicts ─────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(phyloseq)
library(dplyr)
metadata_new = read.table(file="Saanich.metadata.txt", header=TRUE, row.names=1, sep="\t", na.strings="NAN")
OTU_new = read.table(file="Saanich.OTU.txt", header=TRUE, row.names=1, sep="\t", na.strings="NAN")
load("phyloseq_object.RData")
ggplot(metadata_new, aes(x=PO4_uM, y=Depth_m)) +
geom_point(shape=17, color="purple")
metadata_new %>%
select(matches("Temp"))
## Temperature_C
## SI072_S3_010 12.854
## SI072_S3_020 11.005
## SI072_S3_040 9.536
## SI072_S3_060 8.540
## SI072_S3_075 8.480
## SI072_S3_085 8.538
## SI072_S3_090 8.599
## SI072_S3_097 8.647
## SI072_S3_100 8.703
## SI072_S3_110 8.727
## SI072_S3_120 8.796
## SI072_S3_135 8.882
## SI072_S3_150 9.002
## SI072_S3_165 9.041
## SI072_S3_185 9.091
## SI072_S3_200 9.117
metadata_new %>%
mutate(Temperature_F = Temperature_C*1.8+32) %>%
ggplot() + geom_point(aes(y=Depth_m, x=Temperature_F))
physeq_percent = transform_sample_counts(physeq, function(x) 100 * x/sum(x))
plot_bar(physeq_percent, fill="Genus") +
geom_bar(aes(fill=Genus), stat="identity") + ggtitle("Genus Percentages") + xlab("Sample Depth") + ylab("Percent Relative Abundance")
table_5_1 = metadata_new %>% select(Depth_m, O2_uM, PO4_uM, SiO2_uM, NO3_uM, NH4_uM, NO2_uM)
table_5_2 = table_5_1 %>% gather(Nutrients, Concentration, O2_uM, PO4_uM, SiO2_uM, NO3_uM, NH4_uM, NO2_uM)
ggplot(table_5_2, aes(x=Depth_m, y=Concentration)) +
geom_point() + geom_line()+ facet_wrap(~Nutrients, scales="free_y") +
theme(legend.position="none")
Describe the numerical abundance of microbial life in relation to ecology and biogeochemistry of Earth systems.
What were the main questions being asked?
What is the number of prokaryotes and the total amount of their cellular carbon on Earth?
Cellular production rate for all prokaryotes on Earth is estimated to be 1.7 x 10^30 cells per year; highest in open ocean
Do new questions arise from the results?
How does the carbon content of prokaryotes interact with carbon from the environment? How is carbon from relatively inaccessible sources cycled through the carbon cycle?
I was a little confused on the schematics of how different habitats/depths of the earth are linked together; it may be helpful to provide a depth diagram or a habitat diagram to show exactly which areas of the earth we are talking about/calculating for
Describe the numerical abundance of microbial life in relation to the ecology and biogeochemistry of Earth systems.
What are the primary prokaryotic habitats on Earth and how do they vary with respect to their capacity to support life? Provide a breakdown of total cell abundance for each primary habitat from the tables provided in the text.
From Table 5:
Aquatic habitats - 1.2 x 10^29 cells (From Table 1, large population ≠ cell density)
Oceanic subsurface - 3.55 x 10^30 cells
Soil - 2.6 x 10^29 cells
Terrestrial subsurface - 25-250 x 10^28 cells
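The habitat totals above can be added up in R to get a rough global range. This is only a sketch: the vector and variable names are illustrative, and the low and high values for the terrestrial subsurface are simply the two ends of the 25-250 x 10^28 range quoted above.

```r
# Cell abundances quoted above from Table 5 (cells)
fixed_habitats <- c(
  aquatic            = 1.2e29,
  oceanic_subsurface = 3.55e30,
  soil               = 2.6e29
)

# Terrestrial subsurface is given as a range: 25-250 x 10^28 cells
terrestrial_low  <- 25e28
terrestrial_high <- 250e28

# Global totals at the low and high end of the terrestrial range
total_low  <- sum(fixed_habitats) + terrestrial_low
total_high <- sum(fixed_habitats) + terrestrial_high

print(c(low = total_low, high = total_high))
```

With these inputs the total comes out at roughly 4.2 x 10^30 to 6.4 x 10^30 cells, and the oceanic subsurface dominates either way.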
What is the estimated prokaryotic cell abundance in the upper 200 m of the ocean and what fraction of this biomass is represented by marine cyanobacteria including Prochlorococcus? What is the significance of this ratio with respect to carbon cycling in the ocean and the atmospheric composition of the Earth?
Estimated prokaryotic cell abundance in upper 200 m = 3.60 x 10^28 cells
Upper 200 m - cellular density ~ 5 x 10^5 cells/ml
Prochlorococcus - cellular density ~ 4 x 10^4 cells/ml
(4 x 10^4)/(5 x 10^5) x 100 = 8% Prochlorococcus
There must be high turnover of Prochlorococcus in order to support the carbon that’s cycling in the ocean, because only 8% of prokaryotic cells in the upper 200 m of the ocean are Prochlorococcus; Prochlorococcus is the main source of carbon in the ocean
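The 8% figure can be checked with a couple of lines of R. This is a minimal sketch using the two cell densities quoted above; the variable names are illustrative.

```r
# Cell densities quoted above for the upper 200 m of the ocean (cells/ml)
total_density           <- 5e5  # all prokaryotes
prochlorococcus_density <- 4e4  # Prochlorococcus only

# Percentage of prokaryotic cells that are Prochlorococcus
prochlorococcus_percent <- prochlorococcus_density / total_density * 100
print(prochlorococcus_percent)
```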
What is the difference between an autotroph, heterotroph, and a lithotroph based on information provided in the text?
Autotroph - uses inorganic carbon as carbon source (incorporated into their own cells, not just used as part of metabolism; e.g. CO2); self-nourishing; fixes inorganic carbon (CO2) into biomass
Heterotroph - uses organic carbon as carbon source (incorporated into their own cells, not just used as part of metabolism); assimilates organic carbon
Lithotroph - uses inorganic chemicals (e.g. minerals, iron) as e- source; uses inorganic substrates
Based on information provided in the text and your knowledge of geography what is the deepest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this depth?
Deep habitats supporting life: Subsurface -> terrestrial = 4 km; marine = deepest subsurface is 9-10 km from sea level
Primary limiting factor = temperature (avg temperature at this depth is 125˚C, which is close to the upper temperature limit for prokaryotic life); ∆˚C ~ 22˚C/km
Based on information provided in the text and your knowledge of geography what is the highest habitat capable of supporting prokaryotic life? What is the primary limiting factor at this height?
Atmospheric -> 57-77 km above sea level (realistic boundary = 20 km above sea level)
Primary limiting factor = nutrients and temperature; very humid with UV ionization in upper atmosphere
Based on estimates of prokaryotic habitat limitation, what is the vertical distance of the Earth’s biosphere measured in km?
Assuming the top of Mt. Everest as the top boundary and 4 km below subsurface as the lower boundary: 8.8 km + 10 km + 4 km ≈ 23 km
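The sum above can be written out in R. This is a sketch only: reading the three terms as the height of Mt. Everest, a depth of about 10 km below sea level, and a further 4 km of subsurface is an assumption about how the answer was built, and the variable names are illustrative.

```r
# Terms of the sum above, in km (the interpretation of each term is an assumption)
everest_height_km   <- 8.8  # top boundary: summit of Mt. Everest
below_sea_level_km  <- 10   # depth below sea level
subsurface_depth_km <- 4    # subsurface below that

# Vertical extent of the biosphere under these assumptions
biosphere_km <- everest_height_km + below_sea_level_km + subsurface_depth_km
print(round(biosphere_km))
```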
How was annual cellular production of prokaryotes described in Table 7 column four determined? (Provide an example of the calculation)
Population size / (turnover time in days / (365 days/yr)) = cells/year
(3.6 x 10^28 cells)/(16 days/(365 days/yr)) = 8.2 x 10^29 cells/year
Viruses carry accessory metabolic genes -> protein-encoding genes that play a role in cellular metabolism (not just there for viral replication, but can influence the metabolic network within the cell and essentially reprogram the cell) -> cells are information circuit boards that can actually be reprogrammed by viruses!
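The annual production calculation above can be reproduced in R. This is a minimal sketch using the population size and 16-day turnover time from the worked example; the variable names are illustrative.

```r
# Inputs from the worked example above
population_cells <- 3.6e28  # cells
turnover_days    <- 16      # days per turnover

# Convert turnover time to years, then divide population by it
turnover_years    <- turnover_days / 365
annual_production <- population_cells / turnover_years  # cells per year

print(signif(annual_production, 2))
```

Dividing by the turnover time in years is the same as multiplying by the number of turnovers per year (365/16, about 22.8).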
What is the relationship between carbon content, carbon assimilation efficiency and turnover rates in the upper 200m of the ocean? Why does this vary with depth in the ocean and between terrestrial and marine habitats?
Carbon assimilation efficiency is assumed to be 0.2 (or 20%) in the paper (held constant) -> amount of “net productivity” necessary to support turnover of prokaryotes in the upper 200 m of the ocean is 4 times their carbon content or 0.7-2.9 Pg of C -> assuming 85% of net productivity is consumed in the upper 200 m and assuming all this carbon is used by prokaryotes, average turnover rate cannot exceed 15-60 yr^-1
To calculate, we need to know the total # of cells and the amount of C per cell
C per cell = 20 fg C/cell (average = 10 fg C/cell); 1 fg = 10^-30 Pg
Total # of cells = 3.6 x 10^28 cells
3.6 x 10^28 cells x 20 fg C/cell x 10^-30 Pg/fg = 0.72 Pg C in marine heterotrophs
Used a multiplier of 4 in the paper -> 4 x 0.72 = 2.88 Pg C/year (that’s the turnover rate of C)
51 Pg C/year -> 85% is consumed, that’s ~ 43 Pg C/year
(43 Pg C/year)/(2.88 Pg C/year) = 14.9, or 1 turnover every 24.5 days
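The carbon turnover arithmetic above can be sketched in R. The inputs are the values quoted in the answer (3.6 x 10^28 cells, 20 fg C/cell, a multiplier of 4, 51 Pg C/year net productivity, 85% consumed). The unit conversion 1 fg = 10^-30 Pg and the variable names are additions for illustration.

```r
# Inputs quoted above
cells              <- 3.6e28  # marine heterotrophs in the upper 200 m
fg_c_per_cell      <- 20      # fg C per cell
pg_per_fg          <- 1e-30   # 1 fg = 1e-15 g and 1 Pg = 1e15 g
multiplier         <- 4       # net productivity needed per unit of cell carbon
net_productivity   <- 51      # Pg C per year
fraction_consumed  <- 0.85    # share consumed in the upper 200 m

# Standing stock of carbon in these cells (Pg C)
carbon_stock_pg <- cells * fg_c_per_cell * pg_per_fg

# Carbon needed to support one turnover of that stock (Pg C)
carbon_per_turnover_pg <- multiplier * carbon_stock_pg

# Carbon available per year and the turnover rate it can support
carbon_available_pg <- net_productivity * fraction_consumed
turnovers_per_year  <- carbon_available_pg / carbon_per_turnover_pg
days_per_turnover   <- 365 / turnovers_per_year

print(c(stock = carbon_stock_pg,
        per_turnover = carbon_per_turnover_pg,
        turnovers_per_year = turnovers_per_year,
        days_per_turnover = days_per_turnover))
```

Without rounding 43.35 Pg C/year down to 43, this gives about 15 turnovers per year, or one turnover roughly every 24 days, which is consistent with the hand calculation.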
How were the frequency numbers for four simultaneous mutations in shared genes determined for marine heterotrophs and marine autotrophs given an average mutation rate of 4 x 10^-7 per DNA replication? (Provide an example of the calculation with units. Hint: cell and generation cancel out)
4 x 10^-7 mutations/generation
Take it to the power of 4 (for 4 simultaneous mutations) -> 2.56 x 10^-26 mutations/generation
We need to know the turnover rate (how quickly the cells regenerate themselves; how many generations per year?) -> 365 days/16 days = 22.8 turnovers/year
(3.6 x 10^28 cells) x (22.8 turnovers/year) = 8.2 x 10^29 cells/year in ocean
(8.2 x 10^29 cells/year) x (2.56 x 10^-26 mutations/generation) = 2.1 x 10^4 mutations/year = 0.4 hours/mutation
Four simultaneous mutations are rare, but this calculation shows that they still occur frequently (point mutations). There is a whole other mobile aspect to the microbial genome that is drastically more rapid than even this background mutation rate; when you have large population sizes, almost anything is possible
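The mutation frequency calculation above can be reproduced step by step in R. This is a minimal sketch using the numbers from the answer (4 x 10^-7 mutations per generation, 3.6 x 10^28 cells, 16-day turnover); the variable names are illustrative.

```r
# Inputs from the answer above
mutation_rate    <- 4e-7    # mutations per generation
n_simultaneous   <- 4       # number of simultaneous mutations
population_cells <- 3.6e28  # cells
turnover_days    <- 16      # days per generation

# Probability of four simultaneous mutations in one cell per generation
p_four <- mutation_rate^n_simultaneous

# Cells produced per year across the whole population
cells_per_year <- population_cells * 365 / turnover_days

# Expected number of four-mutation events per year, and hours between events
events_per_year <- cells_per_year * p_four
hours_per_event <- 365 * 24 / events_per_year

print(c(p_four = p_four,
        events_per_year = events_per_year,
        hours_per_event = hours_per_event))
```

The result is about 2.1 x 10^4 events per year, or one roughly every 0.4 hours.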
Given the large population size and high mutation rate of prokaryotic cells, what are the implications with respect to genetic diversity and adaptive potential? Are point mutations the only way in which microbial genomes diversify and adapt?
What relationships can be inferred between prokaryotic abundance, diversity, and metabolic potential based on the information provided in the text?
Comment on the emergence of microbial life and the evolution of Earth systems
Indicate the key events in the evolution of Earth systems at each approximate moment in the time series. If times need to be adjusted or added to the timeline to fully account for the development of Earth systems, please do so.
4.6 billion years ago
4.2 billion years ago
3.8 billion years ago
3.75 billion years ago
3.5 billion years ago
3.0 billion years ago
2.7 billion years ago
2.2 billion years ago
2.1 billion years ago
1.3 billion years ago
550,000 years ago
200,000 years ago
Describe the dominant physical and chemical characteristics of Earth systems at the following waypoints:
Hadean
Archean
Precambrian
Proterozoic
Phanerozoic
Discuss the role of microbial diversity and formation of coupled metabolism in driving global biogeochemical cycles.
What are the primary geophysical and biogeochemical processes that create and sustain conditions for life on Earth? How do abiotic versus biotic processes vary with respect to matter and energy transformation and how are they interconnected?
Why is Earth’s redox state considered an emergent property?
How do reversible electron transfer reactions give rise to element and nutrient cycles at different ecological scales? What strategies do microbes use to overcome thermodynamic barriers to reversible electron flow?
Using information provided in the text, describe how the nitrogen cycle partitions between different redox “niches” and microbial groups. Is there a relationship between the nitrogen cycle and climate change?
What is the relationship between microbial diversity and metabolic diversity and how does this relate to the discovery of new protein families from microbial community genomes?
On what basis do the authors consider microbes the guardians of metabolism?
Whitman WB, Coleman DC, and Wiebe WJ. 1998. Prokaryotes: The unseen majority. Proc Natl Acad Sci USA. 95(12):6578–6583. PMC33863